MEDB 5501, Module09

2024-10-15

Topics to be covered

  • What you will learn
    • Restructuring your data
    • One-tailed tests
    • Checking assumptions
    • Alternative tests
    • R code for alternative tests
    • Paired data
    • R code for paired data
    • Your homework

Two different data structures

  • Wide format
    • Group 1 in first column
    • Group 2 in second column
  • Long format
    • Group 1 in first \(n_1\) rows
    • Group 2 in remaining rows
    • Could be different order or mixed
    • Additional column for group

Hypothetical data in the wide format

# A tibble: 4 × 2
    trt   pbo
  <dbl> <dbl>
1    15    34
2    13    31
3    18    36
4    19    NA

Hypothetical data in the long format

# A tibble: 8 × 3
     id result exposed
  <int>  <dbl> <chr>  
1     1     45 n      
2     2     43 n      
3     3     48 n      
4     4     49 n      
5     1     64 y      
6     2     61 y      
7     3     66 y      
8     4     62 y      

Pivoting from wide form to long form

wide_form |>
  pivot_longer( 
    cols=c("trt", "pbo"), 
    names_to="intervention",
    values_to="outcome")
# A tibble: 8 × 2
  intervention outcome
  <chr>          <dbl>
1 trt               15
2 pbo               34
3 trt               13
4 pbo               31
5 trt               18
6 pbo               36
7 trt               19
8 pbo               NA

Pivoting from long form to wide form

long_form |>
  pivot_wider( 
    names_from=exposed,
    values_from=result)
# A tibble: 4 × 3
     id     n     y
  <int> <dbl> <dbl>
1     1    45    64
2     2    43    61
3     3    48    66
4     4    49    62

Break #1

  • What you have learned
    • Restructuring your data
  • What’s coming next
    • One-tailed tests

One-tailed tests

  • \(H_0:\ \mu_1 - \mu_2 = 0\)
  • \(H_1:\ \mu_1 - \mu_2 > 0\)
    • Accept \(H_0\) if \(\bar{X}_1-\bar{X}_2\) is large negative
    • Accept \(H_0\) if \(\bar{X}_1-\bar{X}_2\) is close to zero
    • Reject \(H_0\) if \(\bar{X}_1-\bar{X}_2\) is large positive
  • More precisely,
    • Accept \(H_0\) if \(T \le t(1-\alpha, df)\)
    • Reject \(H_0\) if \(T \gt t(1-\alpha, df)\)
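
The upper-tail critical value can be computed with `qt()`. This is a minimal sketch; the values α = 0.05 and df = 6 are chosen for illustration, not taken from the lecture data.

```r
# Upper-tail critical value for a one-tailed test;
# alpha = 0.05 and df = 6 are illustrative choices
alpha <- 0.05
df <- 6
qt(1 - alpha, df)  # reject H0 if T exceeds this value, roughly 1.943
```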

Critical value for a one-tailed test

One-tailed tests

  • \(H_0:\ \mu_1 - \mu_2 = 0\)
  • \(H_1:\ \mu_1 - \mu_2 < 0\)
    • Reject \(H_0\) if \(\bar{X}_1-\bar{X}_2\) is large negative
    • Accept \(H_0\) if \(\bar{X}_1-\bar{X}_2\) is close to zero
    • Accept \(H_0\) if \(\bar{X}_1-\bar{X}_2\) is large positive
  • More precisely,
    • Accept \(H_0\) if \(T \ge t(\alpha, df)\)
    • Reject \(H_0\) if \(T \lt t(\alpha, df)\)

Critical value

Calculation of the p-value

  • For testing \(H_1:\ \mu_1-\mu_2 \ne 0\)
    • p-value = \(2 P[t(n_1+n_2-2) > |T|]\)
  • For testing \(H_1:\ \mu_1-\mu_2 > 0\)
    • p-value = \(P[t(n_1+n_2-2) > T]\)
  • For testing \(H_1:\ \mu_1-\mu_2 < 0\)
    • p-value = \(P[t(n_1+n_2-2) < T]\)
  • For any of these hypotheses,
    • Accept \(H_0\) if p-value > \(\alpha\)
    • Reject \(H_0\) if p-value \(\le \alpha\)
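
These three p-value formulas map directly onto `pt()`. A sketch, with an illustrative test statistic T = 2.1 and sample sizes n1 = n2 = 8:

```r
# p-values for the three alternative hypotheses;
# T, n1, and n2 are illustrative values
T <- 2.1
n1 <- 8
n2 <- 8
df <- n1 + n2 - 2
2 * pt(abs(T), df, lower.tail = FALSE)  # H1: mu1 - mu2 != 0
pt(T, df, lower.tail = FALSE)           # H1: mu1 - mu2 > 0
pt(T, df)                               # H1: mu1 - mu2 < 0
```

The two one-sided p-values always add to 1, and the two-sided p-value is twice the smaller of them.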

R code for one-sided hypotheses

  • For testing \(H_1:\ \mu_1-\mu_2 \ne 0\)
    • alternative="two.sided"
  • For testing \(H_1:\ \mu_1-\mu_2 > 0\)
    • alternative="greater"
  • For testing \(H_1:\ \mu_1-\mu_2 < 0\)
    • alternative="less"
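
As a sketch of how the argument is used, here is a one-sided test applied to the hypothetical wide-format data shown earlier (the NA value dropped), testing whether the treatment mean is smaller than the placebo mean:

```r
# One-sided two-sample t-test; data are the hypothetical
# trt/pbo values from the wide-format example, NA removed
x1 <- c(15, 13, 18, 19)  # trt
x2 <- c(34, 31, 36)      # pbo
t.test(x1, x2, alternative = "less", var.equal = TRUE)
```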

Break #2

  • What you have learned
    • One-tailed tests
  • What’s coming next
    • Checking assumptions

Population model for the two-sample t-test

  • Population 1
    • \(X_{11},X_{12},...,X_{1N_1}\)
    • \(X_{1i}\) are independent \(N(\mu_1,\sigma_1)\)
  • Population 2
    • \(X_{21},X_{22},...,X_{2N_2}\)
    • \(X_{2i}\) are independent \(N(\mu_2,\sigma_2)\)
  • Both populations independent from each other
  • Both populations have the same standard deviation

Violations of this population model

  • Non-normality
  • Heterogeneity
  • Lack of independence
    • Within each group
    • Between groups

Checking for non-normality

  • Boxplot
    • Might be sufficient by itself
  • Non-normality is less concerning with large sample sizes
  • Residual analysis
    • Normal probability plot
    • Histogram
    • Always residuals, never the original data
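
For a two-group comparison, the residuals are simply each observation minus its group mean. A minimal sketch, using illustrative data rather than the lecture dataset:

```r
# Residual check for normality: subtract each group's mean,
# then draw a normal probability plot; x and g are illustrative
x <- c(15, 13, 18, 19, 34, 31, 36)
g <- c("trt", "trt", "trt", "trt", "pbo", "pbo", "pbo")
res <- x - ave(x, g)   # residuals: values minus their group mean
qqnorm(res)
qqline(res)
```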

Checking for heterogeneity

  • Boxplot
  • Compare group standard deviations
    • 3 to 1 or higher ratio
  • Is the data unbalanced (\(n_1 \ne n_2\))?

Checking for independence

  • Qualitative assessments only
    • Clustering
    • Geographic proximity

Consequences of violations

  • Loss of control of Type I error rate
    • Two possibilities
      • Liberal, actual alpha much larger than 0.05
      • Conservative, actual alpha much smaller than 0.05
  • Possible increase of Type II error rate = loss of power
    • Especially if there are outliers
  • Confidence intervals too wide or too narrow

Break #3

  • What you have learned
    • Checking assumptions
  • What’s coming next
    • Alternative tests

Alternative tests

  • If you have heterogeneity
    • Welch’s test
  • If you have non-normality
    • Mann-Whitney-Wilcoxon test
  • If you have both:
    • Log transformation
  • If you have lack of independence
    • Random effects models
      • Beyond the scope of this class

Welch’s test, 1

  • \(T = \frac{\bar{X}_1-\bar{X}_2}{se}\)
  • se = standard error changes slightly
    • \(se = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}\)
  • df = degrees of freedom changes slightly
    • \(df=\frac{\big(\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}\big)^2}{\frac{S_1^4}{(n_1-1)n_1^2}+\frac{S_2^4}{(n_2-1)n_2^2}}\)
      • Also known as Satterthwaite’s approximation
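
The two formulas above can be checked by hand against what `t.test()` reports. A sketch, reusing the hypothetical trt/pbo values from earlier as illustrative data:

```r
# Welch standard error and Satterthwaite df computed by hand,
# then compared with t.test(var.equal = FALSE); data are illustrative
x1 <- c(15, 13, 18, 19)
x2 <- c(34, 31, 36)
n1 <- length(x1); n2 <- length(x2)
v1 <- var(x1); v2 <- var(x2)
se <- sqrt(v1 / n1 + v2 / n2)
df <- (v1 / n1 + v2 / n2)^2 /
  (v1^2 / ((n1 - 1) * n1^2) + v2^2 / ((n2 - 1) * n2^2))
m <- t.test(x1, x2, var.equal = FALSE)
c(se = se, df = df, df_from_t.test = unname(m$parameter))
```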

Welch’s test, 2

  • R code
    • var.equal=FALSE

Controversy over the use of Welch’s test, 1

  • Some advocate using it all the time.
    • Welch’s test is valid for homogeneity OR heterogeneity
  • Some advocate using a preliminary hypothesis test
    • \(H_0:\ \sigma_1 = \sigma_2\)
      • Levene’s test
      • Bartlett’s test
      • Brown-Forsythe test

Controversy over the use of Welch’s test, 2

  • My opinion:
    • Avoid ANY test of heterogeneity
    • Make decision based on prior experience

Mann-Whitney-Wilcoxon test

Henry Mann, PhD in Mathematics in 1935

Donald Ransom Whitney, PhD in Mathematics in 1946

Frank Wilcoxon, PhD in Chemistry, 1924

Other names

  • Wilcoxon-Mann-Whitney
  • Mann-Whitney U test
  • Wilcoxon rank sum test
    • Not to be confused with Wilcoxon signed rank test

Theory behind the Mann-Whitney-Wilcoxon test

  • Combine the two groups
  • Assign ranks \(R(X_{ij})\)
    • 1 to smallest value, 2 to second smallest, etc.
  • Compute average rank, \(\bar{R}\)
  • Compute the sum of the ranks in first group, \(\Sigma R(X_{1j})\)
  • T = \(\frac{\Sigma R(X_{1j})- n_1 \bar{R}}{se}\)
    • se = \(\sqrt{\frac{n_1 n_2(n_1+n_2+1)}{12}}\)

Alternate theory

  • Count the times that \(X_{1j}\) “wins” compared to all the \(X_2\)’s
    • \(X_{1j}\) wins if it is larger than \(X_{2k}\)
    • Count ties as 0.5 (half of a “win”)
    • There are \(n_1 n_2\) contests
    • U = number of wins
    • T = \(\frac{U-\frac{n_1 n_2}{2}}{se}\)
    • Same se as earlier slide

Hypothetical data

  • X1: 34, 1695, 1193
  • X2: 652, 11, 24, 16, 1543, 39

Rank sum for hypothetical data

Group  Outcome  Rank  
    1       34     4
    1     1695     9
    1     1193     7
    2      652     6
    2       11     1
    2       24     3
    2       16     2
    2     1543     8
    2       39     5
  • \(\Sigma R(X_{1j})\) = 4 + 9 + 7 = 20
  • Expected sum = \(n_1(n_1+n_2+1)/2\) = 3 × 5 = 15

Counting “wins” for hypothetical data

       652   11   24   16 1543   39
  34  Loss  Win  Win  Win Loss Loss
1695   Win  Win  Win  Win  Win  Win
1193   Win  Win  Win  Win Loss  Win
  • U = 14 (out of \(n_1 n_2\) = 18 possible)
  • Expected wins = \(n_1 n_2 / 2\) = 9
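
Both hand calculations can be verified in a few lines of R, using the hypothetical data from these slides:

```r
# Rank sum and "wins" count for the hypothetical data
x1 <- c(34, 1695, 1193)
x2 <- c(652, 11, 24, 16, 1543, 39)
r <- rank(c(x1, x2))       # ranks across the combined sample
W <- sum(r[1:3])           # rank sum for group 1: 20
U <- sum(outer(x1, x2, ">")) +
  0.5 * sum(outer(x1, x2, "=="))  # wins, ties count 0.5: 14
c(rank_sum = W, U = U)
```

The two statistics differ only by a constant: U = W − n1(n1 + 1)/2 = 20 − 6 = 14, which is why the two formulations give the same test.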

The probability basis for ranking

List all the possible outcomes, 1

1,2,3  1,2,4  1,2,5  1,2,6  1,2,7  1,2,8  1,2,9
1,3,4  1,3,5  1,3,6  1,3,7  1,3,8  1,3,9  1,4,5
1,4,6  1,4,7  1,4,8  1,4,9  1,5,6  1,5,7  1,5,8
1,5,9  1,6,7  1,6,8  1,6,9  1,7,8  1,7,9  1,8,9
2,3,4  2,3,5  2,3,6  2,3,7  2,3,8  2,3,9  2,4,5
2,4,6  2,4,7  2,4,8  2,4,9  2,5,6  2,5,7  2,5,8
2,5,9  2,6,7  2,6,8  2,6,9  2,7,8  2,7,9  2,8,9
3,4,5  3,4,6  3,4,7  3,4,8  3,4,9  3,5,6  3,5,7
3,5,8  3,5,9  3,6,7  3,6,8  3,6,9  3,7,8  3,7,9
3,8,9  4,5,6  4,5,7  4,5,8  4,5,9  4,6,7  4,6,8
4,6,9  4,7,8  4,7,9  4,8,9  5,6,7  5,6,8  5,6,9
5,7,8  5,7,9  5,8,9  6,7,8  6,7,9  6,8,9  7,8,9

List all the possible outcomes, 2

 6: 1,2,3
 7: 1,2,4
 8: 1,2,5  1,3,4
 9: 1,2,6  1,3,5  2,3,4
10: 1,2,7  1,3,6  1,4,5  2,3,5
11: 1,2,8  1,3,7  1,4,6  2,3,6  2,4,5
12: 1,2,9  1,3,8  1,4,7  1,5,6  2,3,7  2,4,6  3,4,5
13: 1,3,9  1,4,8  1,5,7  2,3,8  2,4,7  2,5,6  3,4,6
14: 1,4,9  1,5,8  1,6,7  2,3,9  2,4,8  2,5,7  3,4,7  3,5,6

List all the possible outcomes, 3

15: 1,5,9  1,6,8  2,4,9  2,5,8  2,6,7  3,4,8  3,5,7  4,5,6
16: 1,6,9  1,7,8  2,5,9  2,6,8  3,4,9  3,5,8  3,6,7  4,5,7
17: 1,7,9  2,6,9  2,7,8  3,5,9  3,6,8  4,5,8  4,6,7
18: 1,8,9  2,7,9  3,6,9  3,7,8  4,5,9  4,6,8  5,6,7
19: 2,8,9  3,7,9  4,6,9  4,7,8  5,6,8
20: 3,8,9  4,7,9  5,6,9  5,7,8
21: 4,8,9  5,7,9  6,7,8
22: 5,8,9  6,7,9
23: 6,8,9
24: 7,8,9
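
Rather than listing the combinations by hand, `combn()` can enumerate all choose(9, 3) = 84 equally likely rank assignments and compute an exact tail probability for the observed rank sum of 20:

```r
# Enumerate every way to assign 3 of the ranks 1-9 to group 1,
# then find the exact upper-tail probability of a rank sum of 20
sums <- combn(9, 3, sum)
length(sums)      # 84 equally likely outcomes
sum(sums >= 20)   # 11 outcomes with a rank sum of 20 or more
mean(sums >= 20)  # one-sided exact p-value, 11/84
```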

Controversy over the use of nonparametric tests

  • Unclear what your hypothesis is
    • Inequality in population means
    • Inequality in population medians
    • \(P[X > Y] \ne \frac{1}{2}\)
  • Loss of power
    • Multiply the t-test sample size by \(\pi/3 \approx 1.047\)
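
As a quick illustration of the rule of thumb, suppose a power calculation for the t-test called for 60 subjects per group (an illustrative number, not from the lecture):

```r
# Rule-of-thumb sample size inflation for the rank test;
# n_t = 60 per group is an illustrative t-test sample size
n_t <- 60
ceiling(n_t * pi / 3)  # about 63 per group for the rank test
```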

Log transformation

  • Arithmetic mean of original data
    • \(\frac{1}{n}\Sigma X_i\)
  • Arithmetic mean of log transformed data
    • \(\frac{1}{n}\Sigma log(X_i)\)
    • \(\frac{1}{n} log(X_1 \times X_2 \times ... \times X_n)\)
  • Arithmetic mean of log transformed data back transformed
    • \((X_1 \times X_2 \times\ ...\ \times X_n)^\frac{1}{n}\)
    • This is known as the geometric mean
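
A small sketch confirming the identity, with an illustrative data vector:

```r
# Back-transforming the mean of the logs gives the geometric mean;
# x is an illustrative data vector
x <- c(2, 8, 32)
10^(mean(log10(x)))    # mean of log10 values, back-transformed: 8
prod(x)^(1/length(x))  # geometric mean computed directly: 8
```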

Break #4

  • What you have learned
    • Alternative tests
  • What’s coming next
    • R code for alternative tests

postural-sway data dictionary, 1

data_dictionary: postural-sway.txt
copyright: |
  The author of the jse article holds the copyright, but does not list conditions under which it can be used. Individual use for educational purposes is probably permitted under the Fair Use provisions of U.S. Copyright laws.
description: |
  Postural sway is a measure of how well patients can balance. The postural sway was measured using a force plate in two groups of subjects, elderly or young. Sway was measured in the forward/back direction and in the side-to-side direction.
additional_description: https://gksmyth.github.io/ozdasl/general/balaconc.html

postural-sway data dictionary, 2

download_url: https://gksmyth.github.io/ozdasl/general/balaconc.txt
format: tab delimited
varnames: Included in the first line
missing_value_code: not needed
size:
  rows: 17
  columns: 3

postural-sway data dictionary, 3

Age:
  scale: nominal
  values:
  - Elderly
  - Young
FBSway:
  label: Sway in the forward-backward direction
  scale: ratio
  range: positive real
SideSway:
  label: Sway in the side-to-side direction
  scale: ratio
  range: positive real

simon-5501-09-sway.qmd, 1

---
title: "Alternative analysis of postural sway data"
format: 
  html:
    embed-resources: true
---

This program runs some alternatives to the two-sample t-test. Consult the [data dictionary][dic] for information about the data itself.

[dic]: https://github.com/pmean/data/blob/main/files/postural-sway.yaml

This program was written by Steve Simon on 2024-10-13. It is placed in the public domain.

simon-5501-09-sway.qmd, 2

## Libraries

```{r setup}
#| message: false
#| warning: false
library(broom)
library(tidyverse)
```

simon-5501-09-sway.qmd, 3

## Read data

```{r read-memory}
sway <- read_tsv(
  file="../data/postural-sway.txt",
  col_types="cnn")
names(sway) <- tolower(names(sway))
glimpse(sway)
```

simon-5501-09-sway.qmd, 4

## Boxplot of front-to-back sway by age

```{r boxplot-1}
#| fig.height: 2
#| fig.width: 6
sway |>
  ggplot(aes(age, fbsway)) +
    geom_boxplot() +
    ggtitle("Graph drawn by Steve Simon on 2024-10-13") +
    xlab("Treatment group") +
    ylab("Front to back sway") +
    coord_flip()
```

The outlier causes some concern about the validity of the two-sample t-test.

simon-5501-09-sway.qmd, 5

## Descriptive statistics for front-to-back sway by age

```{r group-means}
sway |>
  group_by(age) |>
  summarize(
    fb_mn=mean(fbsway),
    fb_sd=sd(fbsway),
    n=n())
```

In addition to the outlier, notice that the group with the larger mean (elderly) has the larger standard deviation. This indicates that a log transformation may produce better results.

simon-5501-09-sway.qmd, 6

## Log transformation, 1

```{r log-transform-1}
sway |>
  mutate(log_fbsway=log10(fbsway)) -> log_sway
```

simon-5501-09-sway.qmd, 7

## Log transformation, 2

```{r log-transform-2}
#| fig.height: 2
#| fig.width: 6
log_sway |>
  ggplot(aes(age, log_fbsway)) +
    geom_boxplot() +
    ggtitle("Graph drawn by Steve Simon on 2024-10-13") +
    xlab("Treatment group") +
    ylab("Front to back sway") +
    coord_flip()
```

Notice that the highest sway value in the elderly patients is much closer to the rest of the data.

simon-5501-09-sway.qmd, 8

## Log transformation, 3

```{r compare-means-on-log-scale}
log_sway |>
  group_by(age) |>
  summarize(
    log_mn=mean(log_fbsway),
    log_sd=sd(log_fbsway),
    n=n())
```

The standard deviations on the log scale are quite a bit more similar than they were on the original scale.

simon-5501-09-sway.qmd, 9

## Two-sample t-test using the log transformation

```{r t-test}
m2 <- t.test(
  log_fbsway ~ age, 
  data=log_sway,
  alternative="two.sided",
  var.equal=TRUE)
m2
```

There is a statistically significant difference between the log front-to-back sway between elderly patients and young patients. The confidence interval will be interpreted after transforming back to the original scale of measurement.

simon-5501-09-sway.qmd, 10

## Back-transform confidence interval to the original scale.

```{r back-transform}
10^(m2$conf.int)
```

We are 95% confident that the geometric mean front-to-back sway in elderly patients is somewhere between 1.24 times higher and 8 times higher than the geometric mean for young patients. This indicates a statistically significantly higher mean for elderly patients. The confidence interval is still very wide, indicating a lot of sampling error in these two small samples.

simon-5501-09-sway.qmd, 11

## Wilcoxon-Mann-Whitney

```{r wmw-test}
wilcox.test(fbsway ~ age, data=sway)
```

Since the p-value is small, you would reject the null hypothesis and conclude that there is a statistically significant difference in front-to-back sway values between elderly and young patients.

simon-5501-09-sway.qmd, 12

## Paired t-test

The research team is also interested in whether the front-to-back sway is larger than the side-to-side sway, combining both the elderly and young patients into one group.

simon-5501-09-sway.qmd, 13

## Compute and graph differences

```{r compute-difference-1}
sway |>
  mutate(diff_sway=fbsway-sidesway) -> paired_differences
paired_differences |>
    summarize(
      diff_mn=mean(diff_sway),
      diff_sd=sd(diff_sway))
```

The average difference is positive, indicating that front-to-back-sways are larger on average. The standard deviation of the differences is large. 

simon-5501-09-sway.qmd, 14

## Boxplot of differences

```{r compute-difference-2}
#| fig.height: 1.5
#| fig.width: 6
paired_differences |>
  ggplot(aes(diff_sway, "Combined")) +
    geom_boxplot() +
    ggtitle("Graph drawn by Steve Simon on 2024-10-13") +
    xlab("Front-to-back sway minus side-to-side sway") +
    ylab(" ")
```

More than 75% of the differences are positive, also indicating that front-to-back sways tend to be larger.

simon-5501-09-sway.qmd, 15

## Paired t-test

```{r paired-t-test}
t.test(
  sway$fbsway, 
  sway$sidesway,
  paired=TRUE,
  alternative="two.sided")
```

The p-value is small, indicating that front-to-back sways are significantly larger than side-to-side sways. The 95% confidence interval shows that the mean difference is at least 0.21 units larger for front-to-back sway and possibly as large as 7.0 units.
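
Note that a paired t-test is exactly equivalent to a one-sample t-test on the within-subject differences. A self-contained sketch with illustrative paired measurements (not the sway data):

```r
# A paired t-test equals a one-sample t-test on the differences;
# a and b are illustrative paired measurements
a <- c(19, 30, 20, 19, 29)
b <- c(15, 26, 14, 19, 25)
m_paired <- t.test(a, b, paired = TRUE)
m_diff   <- t.test(a - b)
c(paired = m_paired$p.value, one_sample = m_diff$p.value)  # identical
```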

Break #5

  • What you have learned
    • R code for alternative tests
  • What’s coming next
    • Paired data

Paired data

Break #6

  • What you have learned
    • Paired data
  • What’s coming next
    • R code for paired data

Programming template

Break #7

  • What you have learned
    • R code for paired data
  • What’s coming next
    • Your homework

Your homework

Summary

  • What you have learned
    • Restructuring your data
    • One-tailed tests
    • Checking assumptions
    • Alternative tests
    • R code for alternative tests
    • Paired data
    • R code for paired data
    • Your homework